A Supervised Learning based Chunking in Thai using Categorial Grammar

Authors

  • Thepchai Supnithi
  • Chanon Onman
  • Peerachet Porkaew
  • Taneth Ruangrajitpakorn
  • Kanokorn Trakultaweekoon
  • Asanee Kawtrakul
Abstract

One of the challenging problems in Thai NLP is the syntactic analysis of long sentences. This paper applies conditional random fields (CRF) and categorial grammar to develop a chunking method that groups words into larger units. In our experiments, the method achieves around 74.17% on sentence-level chunking. The technique also yields more correct parse trees, with around 50% of trees added. Finally, we address the problem of implicit sentential NPs, one of the difficult problems in Thai language processing: 58.65% of sentential NPs are correctly detected.
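
As a rough illustration of what "grouping words into larger units" means for a sequence labeler such as a CRF, the sketch below turns per-word chunk tags into chunks. It assumes a BIO tagging scheme (B- begins a chunk, I- continues it, O is outside any chunk); the abstract does not state which tag scheme or chunk labels the paper uses, so the function name `group_chunks`, the labels NP and VP, and the placeholder tokens are all hypothetical.

```python
from typing import List, Optional, Tuple


def group_chunks(tokens: List[str], tags: List[str]) -> List[Tuple[str, List[str]]]:
    """Group tokens into (chunk_label, words) pairs from BIO tags.

    Assumption: tags look like "B-NP", "I-NP" or "O". This is an
    illustrative scheme, not necessarily the one used in the paper.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    chunks: List[Tuple[str, List[str]]] = []
    current_label: Optional[str] = None
    current_words: List[str] = []

    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        # An I- tag whose label differs from the open chunk also starts a new chunk.
        starts_new = prefix == "B" or (prefix == "I" and label != current_label)
        if prefix == "O" or starts_new:
            # Close the chunk that was open, if any.
            if current_words:
                chunks.append((current_label, current_words))
            current_label, current_words = None, []
        if prefix in ("B", "I"):
            current_label = label
            current_words.append(token)

    # Close a chunk that runs to the end of the sentence.
    if current_words:
        chunks.append((current_label, current_words))
    return chunks


# Placeholder tokens stand in for segmented Thai words.
example_tokens = ["w1", "w2", "w3", "w4", "w5"]
example_tags = ["B-NP", "I-NP", "B-VP", "O", "B-NP"]
print(group_chunks(example_tokens, example_tags))
```

In a real pipeline the tags would come from a trained sequence model rather than being written by hand; this sketch only covers the final grouping step.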

Similar resources

Weakly-Supervised Bayesian Learning of a CCG Supertagger

We present a Bayesian formulation for weakly-supervised learning of a Combinatory Categorial Grammar (CCG) supertagger with an HMM. We assume supervision in the form of a tag dictionary, and our prior encourages the use of crosslinguistically common category structures as well as transitions between tags that can combine locally according to CCG’s combinators. Our prior is theoretically appeali...

Translating Treebank Annotation For Evaluation

In this paper we discuss the need for corpora with a variety of annotations to provide suitable resources to evaluate different Natural Language Processing systems and to compare them. A supervised machine learning technique is presented for translating corpora between syntactic formalisms and is applied to the task of translating the Penn Treebank annotation into a Categorial Grammar annotatio...

Reining in CCG Chart Realization

We present a novel ensemble of six methods for improving the efficiency of chart realization. The methods are couched in the framework of Combinatory Categorial Grammar (CCG), but we conjecture that they can be adapted to related grammatical frameworks as well. The ensemble includes two new methods introduced here— feature-based licensing and instantiation of edges, and caching of category comb...

Deep multi-task learning with low level tasks supervised at lower layers

In all previous work on deep multi-task learning we are aware of, all task supervisions are on the same (outermost) layer. We present a multi-task learning architecture with deep bi-directional RNNs, where different tasks supervision can happen at different layers. We present experiments in syntactic chunking and CCG supertagging, coupled with the additional task of POS-tagging. We show that it...

KUL-Eval: A Combinatory Categorial Grammar Approach for Improving Semantic Parsing of Robot Commands using Spatial Context

When executing commands, a robot has a certain level of contextual knowledge about the environment in which it operates. Taking this knowledge into account can be beneficial to disambiguate commands with multiple interpretations. We present an approach that uses combinatory categorial grammars for improving the semantic parsing of robot commands that takes into account the spatial context of th...

Publication date: 2010